
37 research outputs found

Legal Interoperability Issues in the Framework of the OpenMinTeD Project: A Methodological Overview

Author: Labropoulou Penny
Margoni Thomas
Piperidis Stelios
Publication venue
Publication date: 01/01/2016
Field of study

This paper is a first analysis of the legal interoperability issues in the framework of the OpenMinTeD (OMTD) project (www.openminted.eu), which aims to create an open, service-oriented e-Infrastructure for Text and Data Mining (TDM) of scientific and scholarly content. The paper offers an overview of the methods for achieving such interoperability

NEUROSURGERY ENTHUSIASTIC WOMEN SOCIETY

Enlighten

A Legal Perspective on Training Models for Natural Language Processing

Author: Dore Giulia
Eckart de Castilho Richard
Gurevych Iryna
Labropoulou Penny
Margoni Thomas
Publication venue
Publication date: 01/01/2018
Field of study

A significant concern in processing natural language data is the often unclear legal status of the input and output data/resources. In this paper, we investigate this problem by discussing a typical activity in Natural Language Processing: the training of a machine learning model from an annotated corpus. We examine which legal rules apply at relevant steps and how they affect the legal status of the results, especially in terms of copyright and copyright-related rights

TUbiblio

Enlighten

The LexMeta Model for Lexical Resources: Theory and Application

Author: Gkirtzou Katerina
Klaes Christiane
Labropoulou Penny
Lindemann David
Publication venue: Institute of Croatian Language and Linguistics
Publication date: 01/01/2023
Field of study

This paper presents LexMeta, a metadata model for the description of lexical resources, such as dictionaries, word lists, glossaries, etc., to be used in language data catalogues mainly targeting the lexicographic and broader humanities communities but also users exploiting such resources in their research and applications. A comparative review of similar models is made in order to show the differences and commonalities with LexMeta. To enhance semantic interoperability and support the exchange of (meta)data across disciplinary and general catalogues, the most influential models for our purposes, namely FRBR (used in library catalogues) and META-SHARE (used for language resources), are selected as a base for the design of LexMeta. We discuss how these models are aligned and extended with new properties as required for the description of lexical resources. The formal representation of the model following the Linked Data paradigm aims to further enhance the semantic interoperability. The choice to implement it in two formats (as an RDF/OWL and as a Wikibase ontology) facilitates its adoption and hence its enrichment, yet poses challenges as to their synchronisation, which are addressed through automatic workflows. We conclude with ongoing and planned activities for the improvement of the model.

HRČAK - Portal of Croatian Scientific and Professional Journals

Computational morphology with OntoLex-Morph

Author: Chiarcos Christian
Gkirtzou Katerina
Khan Fahad
Labropoulou Penny
Passarotti Marco
Pellegrini Matteo
Publication venue
Publication date: 24/04/2023
Field of study

This paper describes the current status of the emerging OntoLex module for linguistic morphology. It serves as an update to the previous version of the vocabulary (Klimek et al. 2019). Whereas the earlier model focused exclusively on descriptive morphology and on applications in lexicography, we now present a novel part of the vocabulary and a novel application of it in language technology, i.e., the rule-based generation of lexicons, introducing a dynamic component into OntoLex

OPUS Augsburg

The LexMeta Metadata Model for Lexical Resources: Theoretical and Implementation Issues

Author: Gkirtzou Katerina
Klaes Christiane
Labropoulou Penny
Lindemann David
Publication venue
Publication date: 01/01/2022
Field of study

The paper presents LexMeta, a metadata model catering for descriptions of human-readable and computational lexical resources included in library catalogues and repositories of language resources. We present the main concepts of the model, its implementation, and discuss current findings and future plans

Mykolas Romeris University Institutional Repository

A Metadata Schema for the Description of Language Resources (LRs)

Author: Arranz Victoria
Francopoulo Gil
Frontini Francesca
Gavrilidou Maria
Labropoulou Penny
Mapelli Valerie
Monachini Monica
Piperidis Stelios
Publication venue: Asian Federation of Natural Language Processing
Publication date
Field of study

This paper presents the metadata schema for describing language resources (LRs) currently under development for the needs of META-SHARE, an open distributed facility for the exchange and sharing of LRs. An essential ingredient in its setup is the existence of formal and standardized LR descriptions, cornerstone of the interoperability layer of any such initiative. The description of LRs is granular and abstractive, combining the taxonomy of LRs with an inventory of a structured set of descriptive elements, of which only a minimal subset is obligatory; the schema additionally proposes recommended and optional elements. Moreover, the schema includes a set of relations catering for the appropriate inter-linking of resources. The current paper presents the main principles and features of the metadata schema, focusing on the description of text corpora and lexical / conceptual resources

PUblication MAnagement

Processing personal data without the consent of the data subject for the development and use of language resources

Author: Birštonas Ramunas
Calamai Silvia
Gavriilidou Maria
Kamocki Pawel
Kelli Aleksei
Labropoulou Penny
Lindén Krister
Stranák Pavel
Vider Kadri
Publication venue: Linköping University Electronic Press
Publication date: 01/01/2019
Field of study

The development and use of language resources often involve the processing of personal data. The General Data Protection Regulation (GDPR) establishes an EU-wide framework for the processing of personal data for research purposes while at the same time allowing for some flexibility on the part of the Member States. The paper discusses the legal framework for language research following the entry into force of the GDPR. In the first section, we present some fundamental concepts of data protection relevant to language research. In the second section, the general framework of processing personal data for research purposes is discussed. In the last section, we focus on the models that certain EU Member States use to regulate data processing for research purposes.

Archivio della Ricerca - Università degli Studi di Siena

Helsingin yliopiston digitaalinen arkisto

Vilnius University Institutional Repository

OpenMinTeD: A Platform Facilitating Text Mining of Scholarly Content

Author: Anastasiou Lucas
Eckart de Castilho Richard
Galanis Dimitrios
Georgantopoulos Byron
Greenwood Mark
Gkirtzou Katerina
Knoth Petr
Labropoulou Penny
Lempesis Antonis
Manola Natalia
Martziou Stefania
Piperidis Stelios
Sachtouris Stavros
Publication venue: European Language Resources Association (ELRA)
Publication date: 01/06/2018
Field of study

The OpenMinTeD platform aims to bring full text Open Access scholarly content from a wide range of providers together with Text and Data Mining (TDM) tools from various Natural Language Processing frameworks and TDM developers in an integrated environment. In this way, it supports users who want to mine scientific literature with easy access to relevant content and allows running scalable TDM workflows in the cloud

TUbiblio

Open Research Online (The Open University)

Documentation and User Manual of the META-SHARE Metadata Model

Author: Arranz Victoria
Declerck Thierry
Despiri Elina
Francopoulo Gil
Frontini Francesca
Gavrilidou Maria
Labropoulou Penny
Mapelli Valérie
Monachini Monica
Piperidis Stelios
Publication venue
Publication date
Field of study

The current deliverable presents the META-SHARE metadata schema v1.0, as implemented in the META-SHARE XSD's v1.0 released to META-NET and PSP partners in July 2011 for text corpora and lexical/conceptual resources, and its supplement for audio corpora, tools and language descriptions (simplified/refactored version) as implemented in November. It is meant to act as a user manual, providing explanations on the model contents for LR providers and LR curators who wish to describe their resources in accordance with it. Work on the schema is ongoing and changes/updates to the model are constantly being made; where appropriate, some changes that are already under way are documented in this deliverable

PUblication MAnagement

OntoLex-Morph: Morphology for the Web of Data

Author: Chiarcos Christian
Gkirtzou Katerina
Ionov Maxim
Khan Anas Fahad
Labropoulou Penny
Passarotti Marco
Pellegrini Matteo
Publication venue
Publication date: 01/01/2022
Field of study

Purpose: OntoLex-Lemon is a widely used community standard for publishing lexical resources in machine-readable form, and is in fact the predominant RDF vocabulary for this purpose. With the growing popularity and increasing adoption of this model for applications in both language technology and lexicography, a number of new modules have been developed in the past year to complement the OntoLex core vocabulary and its lexicographic follow up, lexicog. In this paper, we describe the current status of the development of the OntoLex-Morph vocabulary

Mykolas Romeris University Institutional Repository